New Data Structures for Matrices and Specialized Inner Kernels: Low Overhead for High Performance

Author

  • José R. Herrero

Abstract

Dense linear algebra codes are often expressed and coded in terms of BLAS calls. This approach, however, achieves suboptimal performance due to the overheads associated with such calls. Taking as an example the dense Cholesky factorization of a symmetric positive definite matrix, we show that the potential of non-canonical data structures for dense linear algebra can be better exploited with the use of specialized inner kernels. The use of non-canonical data structures together with specialized inner kernels has low overhead and can produce excellent performance.
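
The idea in the abstract can be illustrated with a small sketch. It is not the author's implementation: the square-block layout, the block size `b`, and every function name below are assumptions chosen for illustration. The lower triangle of the matrix is stored as contiguous `b` x `b` blocks (one possible non-canonical data structure), and a right-looking blocked Cholesky factorization calls three small kernels written for exactly that block size instead of general BLAS routines.

```python
import math

def to_blocks(a, b):
    """Copy the lower block triangle of a dense n x n matrix (list of rows)
    into b x b blocks, each stored contiguously as a flat row-major list.
    Assumes n is a multiple of b (hypothetical layout for this sketch)."""
    nb = len(a) // b
    return [[[a[bi * b + i][bj * b + j] for i in range(b) for j in range(b)]
             for bj in range(bi + 1)] for bi in range(nb)]

def potrf(d, b):
    """Inner kernel: in-place Cholesky of a diagonal block, d = L * L^T."""
    for j in range(b):
        s = d[j * b + j] - sum(d[j * b + k] ** 2 for k in range(j))
        d[j * b + j] = math.sqrt(s)
        for i in range(j + 1, b):
            t = d[i * b + j] - sum(d[i * b + k] * d[j * b + k] for k in range(j))
            d[i * b + j] = t / d[j * b + j]
        for i in range(j):
            d[i * b + j] = 0.0  # clear the strict upper part of the block

def trsm(x, l, b):
    """Inner kernel: x := x * L^(-T), with L the factored diagonal block."""
    for r in range(b):
        for j in range(b):
            s = x[r * b + j] - sum(x[r * b + k] * l[j * b + k] for k in range(j))
            x[r * b + j] = s / l[j * b + j]

def gemm_nt(c, a, bt, b):
    """Inner kernel: c := c - a * bt^T on three b x b blocks."""
    for i in range(b):
        for j in range(b):
            c[i * b + j] -= sum(a[i * b + k] * bt[j * b + k] for k in range(b))

def block_cholesky(blk, b):
    """Right-looking blocked Cholesky on the block storage, in place."""
    nb = len(blk)
    for k in range(nb):
        potrf(blk[k][k], b)
        for i in range(k + 1, nb):
            trsm(blk[i][k], blk[k][k], b)
        for i in range(k + 1, nb):
            for j in range(k + 1, i + 1):
                gemm_nt(blk[i][j], blk[i][k], blk[j][k], b)
    return blk

def from_blocks(blk, b):
    """Expand the block storage back into a dense lower-triangular matrix."""
    n = len(blk) * b
    out = [[0.0] * n for _ in range(n)]
    for bi, row in enumerate(blk):
        for bj, d in enumerate(row):
            for i in range(b):
                for j in range(b):
                    out[bi * b + i][bj * b + j] = d[i * b + j]
    return out
```

Calling `from_blocks(block_cholesky(to_blocks(A, b), b), b)` on a symmetric positive definite `A` returns a lower-triangular factor `L` with `L * L^T` equal to `A` up to rounding. In a compiled language the point of such kernels would be that the block size is fixed, so loops can be fully unrolled and no per-call argument checking is needed; this Python version only shows the structure, not the performance.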

Similar resources

Exposing Inner Kernels and Block Storage for Fast Parallel Dense Linear Algebra Codes

Efficient execution on processors with multiple cores requires the exploitation of parallelism within the processor. For many dense linear algebra codes this, in turn, requires the efficient execution of codes which operate on relatively small matrices. Efficient implementations of dense Basic Linear Algebra Subroutines exist (BLAS libraries). However, calls to BLAS libraries introduce large ov...

Performance Optimizations and Bounds for Sparse Matrix Kernels

Building high-performance implementations of sparse matrix-vector multiply (SpM×V), an important and ubiquitous computational kernel, is fundamentally limited by a variety of factors: the increasing performance gap between processors and memory, the storage and instruction overhead of manipulating sparse data structures, and the irregular memory access due to sparse storage. Moreover, the compl...
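
The overheads named in this abstract can be seen in even a minimal sparse matrix-vector multiply. The sketch below uses compressed sparse row (CSR) storage, which is an assumption made for illustration; the abstract does not name a format, and the function names are hypothetical.

```python
def csr_from_dense(a):
    """Build CSR arrays (values, col_idx, row_ptr) from a dense list of rows."""
    values, col_idx, row_ptr = [], [], [0]
    for row in a:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr

def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A * x for a matrix A held in CSR form."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        acc = 0.0
        for p in range(row_ptr[i], row_ptr[i + 1]):
            # One extra index load per nonzero, and an indirect,
            # data-dependent access into x.
            acc += values[p] * x[col_idx[p]]
        y[i] = acc
    return y
```

Each nonzero costs one stored index in addition to its value (storage overhead), one extra load of `col_idx[p]` (instruction overhead), and a read of `x` at a position that depends on the data (irregular memory access).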

Applicability of Pattern-based sparse matrix representation for real applications

Pattern-based representation (PBR) is a novel sparse matrix representation that reduces the index overhead for many matrices without zero-filling and without requiring the identification of dense matrix blocks. The PBR analyzer identifies recurring block nonzero patterns, represents the submatrix consisting of all blocks of this pattern in block coordinate format, and generates custom matrix-ve...

Application of Independent Joint Control Strategy for Discrete-Time Servo Control of Overhead Cranes

In this study, a new servo control system is presented for the overhead crane based on a discrete-time state feedback approach. It provides both robust tracking and load swing suppression. Inspired by independent joint and computed torque control in the robot manipulator field, a new model is derived in which the crane actuators are considered as the main plant. The crane nonlinearities are then tr...

A hybrid model based on machine learning and genetic algorithm for detecting fraud in financial statements

Financial statement fraud has increasingly become a serious problem for business, government, and investors. In fact, this threatens the reliability of capital markets, corporate heads, and even the audit profession. Auditors in particular face their apparent inability to detect large-scale fraud, and there are various ways to identify this problem. In order to identify this problem, the majori...

Publication date: 2007